On variation of word frequencies in Russian literary texts
Abstract
We study the variation of word frequencies in Russian literary texts. Our findings indicate that the standard deviation of a word's frequency across texts depends on its average frequency according to a power law with exponent 1/2 < α < 1, which shows that rarer words have a relatively larger degree of frequency volatility (that is, higher “burstiness”). A latent factor model is estimated to investigate the structure of the word frequency distribution. The findings suggest that the dependence of a word's frequency volatility on its average frequency can be explained by the asymmetry in the distribution of latent factors. © 2015 Elsevier B.V. All rights reserved.
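The power-law relation between a word's average frequency and its standard deviation across texts can be illustrated with a log-log least-squares fit. The following is only a minimal sketch under stated assumptions, not the authors' estimation procedure: the function name `fit_power_law_exponent` and the input layout (rows are texts, columns are words, entries are relative frequencies) are hypothetical choices made for illustration.

```python
import numpy as np

def fit_power_law_exponent(freqs):
    """Fit std = c * mean**alpha across words by least squares in log-log space.

    freqs: 2-D array, rows = texts, columns = words (assumed layout),
           entries = relative frequency of the word in the text.
    Returns (alpha, c).
    """
    freqs = np.asarray(freqs, dtype=float)
    mean = freqs.mean(axis=0)           # average frequency of each word
    std = freqs.std(axis=0, ddof=1)     # sample standard deviation across texts
    keep = (mean > 0) & (std > 0)       # logarithms require positive values
    alpha, log_c = np.polyfit(np.log(mean[keep]), np.log(std[keep]), 1)
    return alpha, np.exp(log_c)
```

An exponent below 1 from such a fit means the ratio of standard deviation to mean grows as the mean falls, which is the sense in which rarer words are relatively more volatile.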
Similar resources
Variation in the vocabulary of Russian literary texts
We study the variation of word frequencies in Russian literary texts. Our findings indicate that the standard deviation of a word's frequency across texts depends on its average frequency according to a power law with exponent 0.62, showing that the rarer words have a relatively larger degree of frequency volatility (i.e., “burstiness”). Several latent factor models have been estimated to inve...
Testing Problems in Russian as a Foreign Language in a Technical University
Problems in the theory and practice of testing Russian as a foreign language for entrants to technical universities are considered. The benefits of test forms for assessing foreign students' Russian language skills under a strict time limit are presented. The structure and content of the tests, all types of tasks offered on the entrance and final examinations in the Russian languag...
The evolution of the meaning of the word nurse based on the classical texts of Persian literature
Background and Aim: The semantic evolution of a word over time is inevitable, indicating a social, political, religious or cultural process. Nurse is one of the words that has a significant presence in Persian literature texts and has been used in many different meanings such as slave, servant, maid, devotee, obedient, patient and preserver. The purpose of this study is to show its semantic ev...
An acquired taste: How reading literature affects sensitivity to word distributions when judging literary texts
This study examines how reading habits affect people’s sensitivity to word distributions in literary and non-literary writing. We manipulated eight literary and non-literary passages, creating modified versions that had lower word chunk frequencies but higher individual word frequencies than the originals. Subjects were then asked to rate the passages’ quality of writing. Results showed that su...
Towards the Automatic Detection of the Source Language of a Literary Translation
Experiments on the detection of the source language of literary translations are described. Two feature types are exploited: n-gram based features and document-level statistics. Cross-validation results on a corpus of twenty 19th-century texts, including translations from Russian, French, German and texts written in English, are promising: single feature classifiers yield significant gains on the ...
Publication date: 2015